The complexity of biological, social and engineering networks makes it desirable to find natural
partitions into communities that can act as simplified descriptions and provide insight into the 
structure and function of the overall system. Although community detection methods abound, 
there is a lack of consensus on how to quantify and rank the quality of partitions. We show here 
that the quality of a partition can be measured in terms of its stability, defined in terms of the 
clustered autocovariance of a Markov process taking place on the graph. Because the stability has 
an intrinsic dependence on time scales of the graph, it allows us to compare and rank partitions at 
each time and also to establish the time spans over which partitions are optimal. Hence the Markov 
time acts effectively as an intrinsic resolution parameter that establishes a hierarchy of increasingly
coarse clusterings. Within our framework we can then provide a unifying view of several standard
partitioning measures: modularity and normalized cut size can be interpreted as one-step time 
measures, whereas Fiedler's spectral clustering emerges at long times. We apply our method to 
characterize the relevance and persistence of partitions over time for constructive and real networks, 
including hierarchical graphs and social networks. We also obtain reduced descriptions for atomic 
level protein structures over different time scales. 



I. INTRODUCTION 

In recent years, there has been an explosion of interest 
in the analysis of networks as models of complex systems. 
The literature is extensive, spanning areas as diverse as
gene regulation, protein interactions and metabolic pathways,
neuroscience, social networks and engineering systems
such as transportation networks and the internet,
to name but a few [1, 5]. The tools for network analysis
are firmly grounded in results from graph theory, with
an influx of concepts from statistical physics, dynamical
systems and stochastic processes. Due to the large-scale,
complex nature of many systems under study, an
appealing idea is to obtain relevant partitions (or clus- 
terings) of the network that can reveal the underlying 
structure of the system and hence insight into its func- 
tion. These partitions could potentially lead to reduced, 
more manageable representations of the original system. 

The topic of graph community detection has a long 
history and multiple methods and heuristics have been 
proposed to partition graphs into communities or clus- 
ters. (See for instance and references therein for a re- 
cent survey.) However, the extensive list of partitioning 
methods comes with a parallel lack of theory or consen- 
sus on measures to quantify the goodness of a commu- 
nity structure. The simplest such measure is certainly 
the cut size, i.e., the sum of the weights of edges that lie 
at the boundaries of different communities. As a general 
rule, good community structures should have small cut 
size implying that the communities are weakly connected. 
Unfortunately, this simple intuitive notion has negligible
applicability since the partition with minimum cut size
is often trivial. Therefore, a variety of measures have
been proposed including, without claim of exhaustiveness,
normalized cut [5], $(\alpha, \epsilon)$-clustering [6], modularity [7, 8]
and variants and extensions of modularity [9, 10]. Each of
these methods has distinct features and has been shown 
to produce reasonable community structures for differ- 
ent examples. In particular, modularity does not require 
that the number of communities be specified in advance, 
unlike most of the other partitioning methods. However, 
it has been recently shown that optimizing modularity 
can over-partition or under-partition the network, failing 
to find the most natural community structure [11]. To
compensate for this, recent methods [10, 12] have included
an ad hoc resolution parameter that can be tuned
to bias towards small or large communities. The intro- 
duction of these resolution parameters highlights the fact 
that one would expect that any given graph would be de- 
scribed by different natural community structures (finer 
or coarser) under different conditions. 

Our work introduces a quality measure that has the 
intrinsic flexibility to find which clusterings are relevant
at different time scales. This is achieved by establishing 
a link between the quality of the partition and a stochas- 
tic process taking place on the clustered graph. We use 
the well-known relationship between graphs and Markov 
chains: with any unweighted graph we can associate a 
random walk in which the probability of leaving a vertex 
is uniformly distributed among the outgoing edges. This 
Markov viewpoint provides a dynamical interpretation 
of communities. In particular, natural communities at 
a given time scale will correspond to persistent dynam- 
ical basins, that is, sets of states from which escape is 
unlikely within the given time scale. This can be estab- 
lished quantitatively through the autocovariance of the 
clustered Markov process, a measure that defines the persistence
of a cluster in time. In essence, one can think of
the time scale as an intrinsic resolution parameter for the 
clustering: over short time scales, many clusters should 
be coherent; on the other hand, the expectation is that 
there will be few persistent clusters under the action of 
the Markov chain if one waits for a long time. 

An important feature of our approach is that it pro- 
vides a framework that unifies several heuristic measures. 
It turns out that most quality measures introduced in 
the literature have a natural Markov probabilistic in- 
terpretation. We will show below that modularity and 
normalized min-cut are related to the autocovariance on 
paths of length one (i.e., at time t = 1), while Fiedler's 
spectral method corresponds to the limit of long paths 
(i.e., time $t = \infty$). In contrast, our measure considers
paths of all lengths and provides an evaluation of the
quality of a clustering at all times, including fractional
times (0 < t < 1) for which we obtain clusterings finer
than those obtained by modularity optimization. Our 
measure is thus not affected by the resolution limit of 
modularity [13]. 

The rest of the paper proceeds by introducing in sim- 
ple terms the definition of the stability of the clustering 
r(t), which corresponds to the autocovariance of the par- 
tition under a Markov process and provides a measure 
of the quality of any partition over time. As part of our 
derivation, we show that maximizing r(1) amounts to optimizing modularity,
while at long time scales $r(\infty)$ is typically maximized by
the classic 2-way spectral clustering related to the Fiedler
vector. For the intermediate time scales, our measure can 
be used to rank the different partitions and, in doing so, 
establish a hierarchical, time-dependent set of partitions 
that are valid over different time spans. Our measure 
also allows us to compare the community structure ob- 
tained by different algorithms over different timescales. 
In addition, we show how the stability at fractional times, 
r(0 < t < 1), leads to finer partitions than those pro- 
duced by modularity maximization. Therefore the sta- 
bility r(t) provides a unifying framework for the under- 
standing of different and seemingly unrelated clustering 
heuristics in relation to the characteristic Markov time 
over which a given clustering is valid. We exemplify the 
applications of the method with networks drawn from dif- 
ferent fields to showcase the generality of the approach. 



II. METHODS 

A. Autocovariance and stability of a graph 
partition 

Consider an undirected, connected graph with N ver- 
tices and E edges and assume that the graph is non- 
bipartite. For simplicity in the derivation below, we will 
assume that the graph is unweighted, although all our 
results apply equally to weighted graphs. The topology 
of the graph is given by the $N \times N$ adjacency matrix $A$,
a symmetric 0-1 matrix with a 1 if two vertices are connected
and 0 otherwise. The number of edges of each
vertex, or degree, $d_i$ can be compiled into the vector
$d = A\mathbf{1}$, where $\mathbf{1}$ is the vector of ones. We will also
use the diagonal matrix of degrees: $D = \mathrm{diag}(d)$.

A random walk on any such graph defines an associated
Markov chain in which the probability of leaving a
vertex is split uniformly among the outgoing edges, with
a transition probability $1/d_i$ for each edge:

$$p_{t+1} = p_t \left[D^{-1}A\right] \equiv p_t M, \tag{1}$$

where $p_t$ is the (normalized) probability vector and $M$ is
the transition matrix. Under these assumptions, we have
an ergodic and reversible Markov chain with stationary
distribution $\pi = d^T/(d^T\mathbf{1}) = d^T/2E$. We will also use
below the diagonal matrix $\Pi = \mathrm{diag}(\pi)$.
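
As an illustration of the construction above, the following minimal sketch builds $M$ and $\pi$ from an adjacency matrix and iterates Eq. (1). The function name `random_walk` and the four-vertex toy graph are hypothetical choices made for this example only; they are not part of the original analysis.

```python
import numpy as np

def random_walk(A):
    """Return (M, pi) for the random walk on an undirected graph with adjacency A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)          # degree vector d = A 1
    M = A / d[:, None]         # transition matrix M = D^{-1} A (rows sum to 1)
    pi = d / d.sum()           # stationary distribution pi = d^T / 2E
    return M, pi

# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
M, pi = random_walk(A)

p = np.array([1.0, 0.0, 0.0, 0.0])   # walker starts at vertex 0
for _ in range(50):
    p = p @ M                         # Eq. (1): p_{t+1} = p_t M
```

Because the toy graph is connected and non-bipartite, the row vector `p` approaches `pi` as the number of steps grows.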

Consider now a given partition of the graph into $c$ communities.
This (hard) clustering can be encoded in an
$N \times c$ indicator matrix $H$, a 0-1 matrix that records
which vertex belongs to which community. Each row
of $H$ is all zeros except for a 1 indicating the cluster
to which the vertex belongs. Let us now observe the
Markov process (1) through the prism of the given partition
by assigning a different real value $\alpha_i$ to the vertices
of each of the $c$ clusters. The observed signal is then
a stationary, not necessarily Markovian, random process
$(X_t)_{t \in \mathbb{N}}$ which consists of a sequence of values $\alpha_i$. The
expectation for a good partition of the graph over a
given time scale is that the state is more likely to remain
within the starting cluster for such a time span than
would be expected at random. This can be
quantified through the autocovariance of the observable,
$\mathrm{cov}[X_t, X_{t+\tau}] = \mathrm{E}[X_t X_{t+\tau}] - \mathrm{E}[X_t]^2$, where $\mathrm{E}$ denotes
expectation. If the inter-community connections are weak,
the values of $X_t$ and $X_{t+\tau}$ will be correlated for longer
times. How fast the autocovariance decays as a function
of the lag $\tau$ is therefore an indicator of the quality of
the clustering over the corresponding Markov time scale.
This is the main idea underpinning our measure.

The autocovariance of $X_t$ can be rewritten as
$\mathrm{cov}[X_t, X_{t+\tau}] = \alpha^T R_\tau \alpha$, where $\alpha$ is the vector of
labels of the $c$ communities and the matrix $R_t$ is the
clustered autocovariance matrix of the graph:

$$R_t = H^T \left(\Pi M^t - \pi^T \pi\right) H. \tag{2}$$

Note that the matrix $R_t$ depends only on properties of
the graph and clustering. It summarizes the $t$-step dependence
of the transfer probabilities between clusters: each
element $(R_t)_{ij}$ corresponds to the probability of starting
in a cluster $i$ and being in a cluster $j$ after $t$
steps minus the probability that two independent random
walkers are in $i$ and $j$, evaluated at stationarity.

As stated above, a good partition over a given time 
scale should imply a high likelihood of remaining within 
the starting community. In terms of the clustered autocovariance
matrix, the diagonal elements $(R_t)_{ii}$, which
measure the probability of a random path of length $t$ to
start and end in the same community, should be larger
than the off-diagonal ones. This leads to our definition
of the stability of the clustering:

$$r(t; H) = \min_{0 \le s \le t} \sum_{i=1}^{c} (R_s)_{ii} = \min_{0 \le s \le t} \mathrm{trace}[R_s]. \tag{3}$$

A good clustering over time $t$ will have large stability,
with a large trace of Rt over such a time span. Note that 
our definition involves the minimum value of the trace in 
a given interval, i.e., the stability of the partition is large 
only if the values for all times up to t are large. In this 
way, we assign low stability to partitions where there is 
a high probability of leaving the community and coming 
back to it later, as in the case of almost bipartite graphs. 
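
The definition can be made concrete with a short numerical sketch. It evaluates the trace of the clustered autocovariance matrix at integer times and takes the minimum over all times up to $t$. The function names (`indicator`, `autocov_traces`, `stability`) and the six-vertex example graph are assumptions introduced for illustration, and the dense matrix products are chosen for clarity rather than efficiency.

```python
import numpy as np

def indicator(labels):
    """N x c indicator matrix H from integer community labels 0..c-1."""
    labels = np.asarray(labels)
    H = np.zeros((labels.size, labels.max() + 1))
    H[np.arange(labels.size), labels] = 1.0
    return H

def autocov_traces(A, labels, t_max):
    """trace[R_s] for s = 0..t_max, with R_s = H^T (Pi M^s - pi^T pi) H."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    pi = d / d.sum()
    M = A / d[:, None]
    H = indicator(labels)
    baseline = np.sum((pi @ H) ** 2)       # trace of H^T pi^T pi H
    PiMs = np.diag(pi)                     # Pi M^0
    traces = []
    for _ in range(t_max + 1):
        traces.append(np.trace(H.T @ PiMs @ H) - baseline)
        PiMs = PiMs @ M                    # advance to Pi M^{s+1}
    return np.array(traces)

def stability(A, labels, t):
    """Stability of a partition: minimum of trace[R_s] over integer s in 0..t."""
    return autocov_traces(A, labels, t).min()

# Hypothetical example: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

two_triangles = [0, 0, 0, 1, 1, 1]
r0 = stability(A, two_triangles, 0)
r1 = stability(A, two_triangles, 1)
```

For this example the partition into the two triangles has stability 1/2 at $t = 0$ and 5/14 at $t = 1$, while the partition with all vertices in a single community has zero stability at all times.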

The stability (3) is the fundamental tool we propose to
assess the quality of different clusterings over time. For 
each candidate clustering, we can compute the stability 
at all times and rank the possible partitions. Clearly, cer- 
tain partitions might only be optimal in particular time 
windows and different partitions will be optimal at differ- 
ent times. For each Markov time t, we seek the partition 
with the largest stability to obtain the stability curve of 
the graph: $r(t) = \max_H r(t; H)$. This curve establishes a
time hierarchy of partitions, from finer to coarser as time
grows, as shown in Figure 1 for a social network. This
underscores the idea that partitions are better or worse 
depending on the time of interest, and the concept of the 
Markov time as an intrinsic resolution parameter that 
establishes when a partition is good. In this sense, the 
most relevant partitions will be optimal over long time 
windows, because they serve as good representations over 
extended time scales of the system. 
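
A sketch of how a stability curve can be estimated from a finite set of candidate partitions follows. Since the maximization over all partitions is not attempted here, the result is only a lower estimate of the true curve. The helper names (`traces`, `stability_curve`) and the candidate list are assumptions made for this example.

```python
import numpy as np

def traces(A, labels, t_max):
    """trace[R_s], s = 0..t_max, for the partition given by integer labels."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    pi, M = d / d.sum(), A / d[:, None]
    H = np.eye(max(labels) + 1)[list(labels)]      # N x c indicator matrix
    out, PiMs = [], np.diag(pi)
    for _ in range(t_max + 1):
        out.append(np.trace(H.T @ PiMs @ H) - np.sum((pi @ H) ** 2))
        PiMs = PiMs @ M
    return np.array(out)

def stability_curve(A, candidates, t_max):
    """Best stability and index of the best candidate at each t = 0..t_max."""
    # The stability of each candidate is the running minimum of its traces.
    r = np.array([np.minimum.accumulate(traces(A, lab, t_max)) for lab in candidates])
    return r.max(axis=0), r.argmax(axis=0)

# Hypothetical example: two triangles joined by one edge, three candidate partitions.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
candidates = [
    [0, 1, 2, 3, 4, 5],    # finest partition: every vertex alone
    [0, 0, 0, 1, 1, 1],    # the two triangles
    [0, 0, 0, 0, 0, 0],    # coarsest partition: one community
]
best, which = stability_curve(A, candidates, 5)
```

In this small example the finest partition is optimal only at $t = 0$, and the two-triangle partition takes over from $t = 1$ onwards, which illustrates how a partition can be judged by the length of the time window over which it is optimal.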



B. Relationship of the stability with modularity, 
cut, normalized cut and spectral partitioning 

An important feature of the stability (3) is that it encompasses
several of the criteria for clustering in the literature
and allows us to interpret those heuristics in terms
of the relevant Markov time scales of the graph. To ex- 
plore this, we study the autocovariance Rt and the sta- 
bility r(t) in different limits. 

First, consider short times. At time $t = 0$, the partition
with the largest stability is the finest possible clustering.
This follows from the definition, which gives $r(0) = 1 - \|\pi H\|_2^2$; this
becomes maximal when each vertex is in its own cluster,
as follows from elementary inequalities.

At time $t = 1$, we recover modularity, a popular measure
for community detection [7]. It follows from the definition
that modularity is equal to the trace of $R_1$, the
autocovariance matrix at time $t = 1$. Therefore, maximizing
$r(1)$ is equivalent to modularity optimization. (See
also [Hj for an alternative, non-dynamical take on this 
issue.) The stability is also related to other measures in 
the literature. Consider the cut size (Cut), defined as
the number of inter-community edges divided
by the total number of edges of the graph. It is easy to
see that $\mathrm{Cut} = r(0) - r(1)$. Hence modularity is equal to
$1 - \mathrm{Cut} - \|\pi H\|_2^2$, and maximizing modularity is equivalent
to minimizing $\mathrm{Cut} + \|\pi H\|_2^2$. This is the reason why
modularity tends to produce balanced partitions: minimizing
Cut favors few clusters, possibly of very unequal
sizes, while minimizing $\|\pi H\|_2^2$ tends to favor many clusters
of equal size. An alternative measure to modularity
is the so-called Normalized Cut size (NCut) [5]. For the
case of two communities, NCut is the number of inter-community
edges multiplied by the sum of the inverse of
the number of edges in each community, which can be
shown to equal $\mathrm{NCut} = \rho(0) - \rho(1)$, where $\rho(t)$ is given
by the same expression as the stability $r(t)$ replacing covariances
by correlations. Hence NCut is also a one-step
measure.

FIG. 1: (A) Largest connected component of a graph of scientific
collaborations in network science [14]. The vertices correspond
to $N = 379$ researchers indexed by the 21-way partition
obtained by maximizing the stability at $t = 1$ (or equivalently,
modularity). A list of names for this graph and groupings is
available in the Supplementary Information. (B) Stability
curve obtained with the divisive KVV algorithm (top) and
the corresponding dendrogram of the hierarchy of partitions
(bottom). Note the simplicity of the dendrogram, which is not
a binary tree, as compared with the many branching points
obtained by standard binary partition methods. Only two
clusterings are long-lived: the two-way clustering (trivially)
and the five-way partition represented by areas shaded in different
colors in (A).
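
The one-step identities relating modularity, Cut and the autocovariance can be checked numerically on a small graph. In the sketch below, modularity is computed from the standard formula $\frac{1}{2E}\sum_{ij}\left[A_{ij} - d_i d_j/2E\right]\delta(c_i, c_j)$ and compared with the trace of $R_1$. The function names and the example partitions are assumptions chosen for illustration.

```python
import numpy as np

def trace_R(A, labels, t):
    """trace of R_t = H^T (Pi M^t - pi^T pi) H for integer t >= 0."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    pi, M = d / d.sum(), A / d[:, None]
    H = np.eye(max(labels) + 1)[list(labels)]
    R = H.T @ (np.diag(pi) @ np.linalg.matrix_power(M, t) - np.outer(pi, pi)) @ H
    return np.trace(R)

def modularity(A, labels):
    """Modularity: (1/2E) sum_ij [A_ij - d_i d_j / 2E] delta(c_i, c_j)."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    d = A.sum(axis=1)
    two_E = d.sum()
    same = labels[:, None] == labels[None, :]
    return ((A - np.outer(d, d) / two_E) * same).sum() / two_E

def cut_fraction(A, labels):
    """Number of inter-community edges divided by the total number of edges."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    return (A * different).sum() / A.sum()

# Hypothetical example: two triangles joined by one edge, with two test partitions.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
partitions = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]]
checks = [(modularity(A, p) - trace_R(A, p, 1),
           cut_fraction(A, p) - (trace_R(A, p, 0) - trace_R(A, p, 1)))
          for p in partitions]
```

Both differences in `checks` vanish up to rounding error for either partition, consistent with modularity being the trace of $R_1$ and Cut being the drop in the trace between $t = 0$ and $t = 1$.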

The discussion above shows that modularity, Cut and
NCut are based on the one-step behavior of the Markov
process. On the other hand, our stability measure takes
into account the dependence of the autocovariance at all
times. In fact, the behavior of $r(t)$ in the long time
limit $t \to \infty$ establishes a link with spectral clustering
methods, the other standard toolbox for graph partitioning.
Spectral methods are generally based on the Fiedler
eigenvector, i.e., the eigenvector associated with the second
smallest eigenvalue of the Laplacian matrix $L = D - A$,
or of the normalized Laplacian $\mathcal{L} = D^{-1/2} L D^{-1/2}$.
In Fiedler's original work [15, 16], the graph was partitioned
into two subgraphs according to the sign of the
components of the Fiedler vector. More recently, graph
partitioning based on the normalized Fiedler vector has
been proposed [17] and shown to be a heuristic for the
optimal NCut 2-way clustering [5].

The analysis of our measure shows that spectral clustering
is not just a heuristic but an exact method to
find the most stable partitions at long time scales. This
follows from the spectral decomposition of the normalized
Laplacian $\mathcal{L}$, which is trivially related to that of
$\mathcal{M} = D^{1/2} M D^{-1/2} = \sum_{i=1}^{N} \lambda_i u_i u_i^T$, since $\mathcal{L} = I - \mathcal{M}$. Here the eigenvalues
$\lambda_i$ are ranked in order of decreasing magnitude and
the corresponding eigenvectors $u_i$ are orthonormal. In
particular, $\lambda_1 = 1$ and $u_1 = (1/\sqrt{2E})\, D^{1/2}\mathbf{1}$, leading to
the following asymptotic behavior:

$$\mathrm{trace}[R_t] = \sum_{i=2}^{N} \lambda_i^t \left\| u_i^T \Pi^{1/2} H \right\|_2^2 \;\xrightarrow{\;t \to \infty\;}\; \lambda_2^t \left\| u_2^T \Pi^{1/2} H \right\|_2^2. \tag{4}$$

If $\lambda_2$ is positive, $u_2$ is the normalized Fiedler eigenvector
and the clustering with maximal stability at long
times typically corresponds to the Fiedler partition. To
see this, take initially the finest possible partition with
each node in a cluster by itself and cluster together vertices
$i$ and $j$. This induces a variation in (4) given by
$(\lambda_2^t/E)\sqrt{d_i d_j}\, u_{2,i}\, u_{2,j}$, which is only positive if the components
of the normalized Fiedler vector for nodes $i$ and
$j$ have the same sign. Applied recursively, this leads to
the result that the partition with the largest stability at
long times is typically the 2-way clustering according to
the sign of the entries of the Fiedler vector.

When $\lambda_2$ is negative, $u_2$ is not the Fiedler eigenvector
but rather the eigenvector associated with the largest
eigenvalue of $\mathcal{L}$, i.e., with the most negative eigenvalue
of $\mathcal{M} = D^{1/2} M D^{-1/2}$. In this case, the dominant
term in (4), and hence the stability, becomes negative
for all partitions except for the coarsest clustering with
all nodes in one community and $H = \mathbf{1}$, for which the
stability is zero at all times, following from (4) and orthogonality.
We thus conclude that, at large times, the
clustering with maximal stability is either a one-way or
two-way partition. In the latter case, it is given by the
normalized Fiedler vector.
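
The long-time argument can be illustrated with a short sketch that diagonalizes the symmetric matrix $D^{-1/2} A D^{-1/2}$, evaluates the spectral sum for the trace of $R_t$, and reads off the two-way split from the signs of $u_2$. The function names are hypothetical, and returning the one-community partition when $\lambda_2 \le 0$ is a simplification adopted for this example.

```python
import numpy as np

def spectral_terms(A):
    """Eigenpairs of D^{-1/2} A D^{-1/2}, sorted by decreasing |eigenvalue|."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    S = A / np.sqrt(np.outer(d, d))     # symmetric, same spectrum as M = D^{-1} A
    lam, U = np.linalg.eigh(S)
    order = np.argsort(-np.abs(lam))
    return lam[order], U[:, order], d

def trace_R_spectral(A, labels, t):
    """Spectral sum: sum over i >= 2 of lam_i^t ||u_i^T Pi^{1/2} H||^2."""
    lam, U, d = spectral_terms(A)
    H = np.eye(max(labels) + 1)[list(labels)]
    proj = U.T @ (np.sqrt(d / d.sum())[:, None] * H)   # row i holds u_i^T Pi^{1/2} H
    weights = (proj ** 2).sum(axis=1)
    return float(np.sum(lam[1:] ** t * weights[1:]))

def fiedler_partition(A):
    """2-way split by the sign of u_2 if lam_2 > 0, else a single community."""
    lam, U, _ = spectral_terms(A)
    if lam[1] <= 0:
        return [0] * len(lam)
    return [int(x > 0) for x in U[:, 1]]

# Hypothetical example: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
labels = fiedler_partition(A)
```

For this graph $\lambda_2$ is positive and the sign pattern of $u_2$ separates the two triangles. Which triangle receives label 0 depends on the arbitrary sign of the eigenvector returned by the solver.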

The overall picture emanating from our analysis is that 
the partition with highest stability evolves from the finest 
possible (each vertex by itself) at $t = 0$, through the optimal
modularity clustering at $t = 1$, onto a sequence
of coarser partitions, the last of which is typically the
two-way spectral clustering (or the one-way trivial clustering)
as $t \to \infty$. Although the sequence of partitions is
not necessarily always coarser at increasing
times (we may have incomparable clusterings that are 
optimal at different times), we do expect that the clus- 
terings will roughly contain fewer and fewer clusters as 
the Markov time grows. 



III. APPLICATIONS AND EXAMPLES 

We now show the applicability of the method by an- 
alyzing three examples drawn from social interactions, 
hierarchical scale-free graphs and protein structural net- 
works. Rather than being exhaustive, our goal is to high- 
light through each example some of the wider features of 
our approach. 



A. Example 1 — Time hierarchy of partitions and 
comparison of clustering algorithms 

Our first example deals with the graph of collaborations
between researchers in network science shown in
Figure 1A [14]. Community structures are relevant for
social networks, where the identification of groups of people
with strong ties can help unravel underlying patterns
of interdependence [3]. In Figure 1B we show the time
hierarchy of partitions associated with the stability curve
of the network. Our measure (3) is used to rank partitions
efficiently, since the stability of a given clustering
$r(t; H)$ is directly computable in $O(cEt)$, or estimated in
$O(Kt)$ with accuracy $O(c/\sqrt{K})$ through $K$ random walks
of length $t$. In order to obtain the stability curve, one
needs to maximize the stability over all partitions. Given
that modularity optimization is provably NP-hard [18],
it is likely that no efficient algorithm exists for the optimization
of stability for arbitrary graphs. However, for
all practical applications, we can still obtain sequences
of partitions through the use of a number of partitioning
algorithms with different heuristic strategies, such as
aggregative (i.e., unifying clusters from the finest clustering)
or divisive (i.e., splitting clusters from the coarsest
clustering). Figure 1B is the result of the application of
Kannan, Vempala and Vetta's (KVV) conductance spectral
algorithm [6] under a divisive strategy to produce
a sequence of partitions, which are then ranked according
to their stability to estimate the stability curve $r(t)$.
This curve is then translated into a non-binary dendrogram
representing the sequence of community structures
with maximal stability as a function of time. The dendrogram
has the advantage of being relatively simple, with
fewer branching points compared with the binary trees
produced by most hierarchical community detection algorithms.
In this case, the time hierarchy of partitions indicates
that the modularity-optimal clustering into 21 communities
is short-lived whereas a partition into 5 communities
persists over a long time window. This suggests the
relevance of this coarser meta-community structure as indicative
of the likelihood of information to flow within the
five subgroups of researchers.

FIG. 2: A comparison of the stability curve of the partitions
obtained through a divisive strategy using four clustering algorithms
(Shi-Malik [5], KVV [6], Newman [14] and Newman-Girvan [7])
on the network of scientific collaborations pictured
in Fig. 1.
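
The random-walk estimate of the stability mentioned in this example can be sketched in plain Python. The estimator below is one possible reading of that remark, not the authors' implementation: walkers start from the stationary distribution, and the fraction found in their starting community after $s$ steps estimates the first term of the trace of $R_s$. The second term is computed exactly from the degrees, which is an assumption made here for simplicity. All function names are hypothetical.

```python
import random

def estimate_traces(neighbors, labels, t, K, seed=0):
    """Monte Carlo estimate of trace[R_s] for s = 0..t from K random walks."""
    rng = random.Random(seed)
    degrees = [len(nb) for nb in neighbors]
    two_E = sum(degrees)
    # Stationary mass of each community, computed exactly from the degrees.
    mass = {}
    for v, c in enumerate(labels):
        mass[c] = mass.get(c, 0.0) + degrees[v] / two_E
    baseline = sum(m * m for m in mass.values())
    same = [0] * (t + 1)     # walkers found in their starting community at step s
    starts = rng.choices(range(len(neighbors)), weights=degrees, k=K)
    for v in starts:
        home = labels[v]
        for s in range(t + 1):
            same[s] += labels[v] == home
            if s < t:
                v = rng.choice(neighbors[v])   # uniform step to a neighbour
    return [count / K - baseline for count in same]

def estimate_stability(neighbors, labels, t, K, seed=0):
    """Estimated stability: minimum of the estimated traces over s = 0..t."""
    return min(estimate_traces(neighbors, labels, t, K, seed))

# Hypothetical example: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
neighbors = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
labels = [0, 0, 0, 1, 1, 1]
est = estimate_traces(neighbors, labels, t=3, K=20000, seed=1)
```

The cost is proportional to $Kt$ steps, and the statistical error of each estimated trace decreases as $1/\sqrt{K}$.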

Our stability measure can also be used to rank the
sequences of partitions obtained by different algorithms
and strategies. Figure 2 presents the comparison of the
estimated stability curves from four algorithms chosen
for their simplicity and popularity and because they represent
different overall methodologies. In addition to the
KVV conductance method introduced above [6], we have
also examined Shi-Malik's recursive spectral method [5],
Newman's spectral method to optimize modularity [14]
and the Newman-Girvan betweenness algorithm [7]. In
all cases, we use a divisive strategy to produce a sequence
of increasingly fine $k$-way partitions and obtain an estimate
of the stability curve $r(t)$ by choosing the best
partition at each time. For details of the algorithms
see the Supplementary Information. Figure 2 shows that
Shi-Malik and KVV produce the partitions with highest
stability at all shown times (alternately better in different
time windows), followed closely by the Newman-Girvan
algorithm and Newman's spectral algorithm. At
longer times (up to $t = 1000$ at least), the KVV method
slightly dominates the Shi-Malik and Newman-Girvan algorithms,
while Newman's clustering algorithm is worse by
a factor of two. These observations are not evidence of
the superiority of one method over another, but an example
of how to compare and use the different partitioning
algorithms on a given example.

B. Example 2 — Beyond the resolution limit of 
modularity: the small time limit of the continuous 
process 

Recently, it has been shown that modularity optimization
cannot produce partitions smaller than a certain
relative size. This effect, termed the resolution limit
of modularity, leads to partitions coarser than the expected
'natural' community structure [11]. So far, based
on the discrete-time stability (3), our analysis has shown
that at time $t = 0$, the most stable community structure
corresponds to the trivial partition of each vertex
in its own community, while the modularity-optimal community
structure corresponds to time $t = 1$. For $t > 1$,
the most stable community structures are coarser than
those found by modularity optimization. In order to obtain
finer community structures than modularity (i.e.,
beyond the resolution limit), we must consider the stability
at times between zero and one. In fact, this regime
can be studied within our framework through the natural
extension to the continuous-time version of Eq. (2)
obtained through substitution of $M^t$ by $\exp[(M - I)t]$,
where $I$ is the identity matrix [19]. Keeping linear terms
in the small $t$ expansion of the matrix exponential, we
get the following approximation of the stability for small
(continuous) times:

$$r_c(t) \approx (1 - t)\, r(0) + t\, r(1), \qquad 0 < t < 1. \tag{5}$$

Note that this linear interpolation recovers modularity 
r(l) at t = 1 and the totally unclustered graph r(0) at 
time t = 0. It also provides an interpretation in terms 
of Markov time of the resolution parameter proposed by 
Reichardt and Bornholdt [10] and is related to a heuristic
proposed by Arenas et al. [12] consisting of the addition
of weighted self-loops to the graph. 
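
The quality of the linear interpolation can be checked against the full matrix exponential on a small graph. The sketch below computes $\exp[(M - I)t]$ through the eigendecomposition of the symmetric matrix $D^{-1/2} A D^{-1/2}$, which is a convenience chosen for this example. In continuous time every mode decays monotonically, so the trace itself is used as the stability without taking a minimum over earlier times. Function names and the example graph are hypothetical.

```python
import numpy as np

def continuous_trace(A, labels, t):
    """trace of H^T (Pi exp[(M - I) t] - pi^T pi) H for real t >= 0."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    pi = d / d.sum()
    H = np.eye(max(labels) + 1)[list(labels)]
    # M = D^{-1/2} S D^{1/2} with S symmetric, so exp[(M - I)t] follows from eigh(S).
    lam, U = np.linalg.eigh(A / np.sqrt(np.outer(d, d)))
    expS = (U * np.exp((lam - 1.0) * t)) @ U.T
    expM = expS * np.sqrt(d)[None, :] / np.sqrt(d)[:, None]   # D^{-1/2} expS D^{1/2}
    return np.trace(H.T @ (np.diag(pi) @ expM - np.outer(pi, pi)) @ H)

def discrete_trace(A, labels, t):
    """trace of H^T (Pi M^t - pi^T pi) H for integer t >= 0."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    pi, M = d / d.sum(), A / d[:, None]
    H = np.eye(max(labels) + 1)[list(labels)]
    return np.trace(H.T @ (np.diag(pi) @ np.linalg.matrix_power(M, t) - np.outer(pi, pi)) @ H)

def linear_interpolation(A, labels, t):
    """Small-time approximation (1 - t) r(0) + t r(1)."""
    return (1.0 - t) * discrete_trace(A, labels, 0) + t * discrete_trace(A, labels, 1)

# Hypothetical example: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
lab = [0, 0, 0, 1, 1, 1]
exact = continuous_trace(A, lab, 0.05)
approx = linear_interpolation(A, lab, 0.05)
```

The two values agree up to terms of second order in $t$, so the approximation is tight only for small times and degrades as $t$ approaches one.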

As an example, Figure 3 shows the stability curve for
times smaller than one of the partitions of a 125-vertex
hierarchical scale-free graph recently proposed by Ravasz
and Barabási [20]. In this simple model, the natural clustering
is not found through modularity. Our method, on
the other hand, finds that the natural partitions into 25
and 5 clusters have long windows of stability while the
partition obtained by modularity at $t = 1$ is a transient
with no extended significance. See [21] for another dynamical
analysis of the same graph.

FIG. 3: Stability curve of a hierarchical, scale-free graph with
$N = 125$ vertices proposed in [20] (shown in the inset), calculated
for times smaller and larger than one. Note that the
natural partitions into 5 and 25 communities have a long time
scale of stability, while the modularity-optimal clustering (at
$t = 1$) can be seen as a transient.

C. Example 3: Structural graphs, model reduction
and time scales

Our final example shows an application of our framework
to analyze graphs of atomic level protein structures
and its relevance to model reduction of biophysical systems.
Recently, new methods based on the explicit consideration
of graphs of constraints have been proposed
to simplify the complex dynamics of large biomolecules
such as proteins. The idea is to obtain a simplified, lower-dimensional
mechanical description of the movement of
the protein in terms of a few relatively rigid parts connected
by flexible elements [25, 26, 27]. Because
rigid parts are likely to form a tightly-knit network
of chemical bonds and chemical constraints, while being
loosely interconnected to each other, we expect that a
reasonable approximation to the constrained flexibility of
the protein will be given by the partition of the structural
graph of the protein with atoms as vertices and edges corresponding
to bonds and chemical constraints [25].

Figure 4A shows the time hierarchy of partitions of
a full-atom ($N = 2085$) structural graph of the protein
Adenylate Kinase (AK) in its open configuration.
In this example, biophysical considerations indicate that
optimizing modularity over-partitions the graph: the 31
communities obtained at $t = 1$ split several rigid structural
motifs such as $\beta$-sheets and $\alpha$-helices. We use
the Shi-Malik divisive algorithm to estimate the stability
curve and obtain a hierarchy of coarser structures at
longer times. Some of the optimal partitions (notably
those into 18 and 4 communities) prevail over relatively
long time windows and contain significant biophysical
features. To make this more precise, we evaluate the relative
variation in the intra-community positions of the
$C_\alpha$ carbons of two known functional configurations of
AK (open vs. closed) for all partitions obtained in our
study. Figure 4B shows the intra-community stretching
for all partitions calculated as follows: calculate all
pair distances between atoms within each community in
both configurations of the protein and obtain $\Delta$, the average
square variation of those distances over all communities.
If the communities are completely rigid, the pair
distances within communities will not change and $\Delta = 0$.
The maximum value $\Delta = 37$ is the average square variation
for all atoms in the protein (i.e., when we consider
all of them in one community). As the number of communities
in the partition grows, one expects that $\Delta$ will decrease,
since the number of pair distances decreases. The
key is to find when the addition of a community does not
result in a significant decrease of $\Delta$. This implies that the
new communities added are not significantly rigid. This
is observed in the plateaux in $\Delta$ that follow the 4-way
and 18-way community structures and is consistent with
the extended time scales of prevalence for both partitions
in the stability curve. This indicates that the 4-way and
18-way community structures are a reasonable compromise
between simplicity and predictive power for rigidity.
We remark for this particular example that the 'Markov
time' is defined as an abstract entity, not to be assigned
an immediate link with a physical quantity. The rigorous
connection between the Markov time and the biophysical
time of protein motions is currently being pursued.



IV. DISCUSSION AND FUTURE WORK

In this work, we have introduced the stability (3) as
a quality measure of a graph partition. The stability of 
a partition is defined in terms of the autocovariance of 
a Markov process taking place on the clustered graph 
and is explicitly dependent on the Markov time, an in- 
trinsic time scale of the network. This allows us to rank 
partitions and establish their relevance over each time 
scale. Although Markov chains [28, 29, 30] and dynamical
behaviors based on oscillator dynamics [21, 31] have
been used in relation to community detection, previous 
methods have not considered the definition of a quality 
measure, nor have they introduced the concept of paths 
of different lengths to evaluate the quality of partitions 
across time scales. 

The resulting sequence of partitions with maximum 
stability as a function of time leads to a time hierarchy 
of clusterings, from finer to coarser as the Markov time 
grows. This hierarchy can be used to establish the most 
relevant partitions over the significant time scales under- 
lying a process. Hence, our method does not provide 
a unique partition for the graph. Rather, we propose
that obtaining the distinct partitions which are valid
over different time windows and selecting those partitions 
that are relevant over extended time scales may be better 
suited for many applications. In particular, if a network 
has been obtained from an underlying dynamical process 
with well defined time scales, our analysis can suggest re- 
duced representations valid over time windows of interest 
in the process. On the other hand, if the network under 
study does not have an obvious temporal interpretation, 
the Markov time acts effectively as an intrinsic resolution 
parameter for the partitions. 

FIG. 4: Analysis of the atomic-level structural graph of the protein Adenylate Kinase (AK) with 2085 vertices. (See the 
Supplementary information for a detailed explanation on how this graph is obtained.) (A) The optimal stability curve for this 
graph is estimated by the divisive Shi-Malik algorithm, where the dashed lines are the stability curves of the different partitions 
and the solid curve is the maximum of all dashed curves at each Markov time. The 31-way clustering with optimal modularity 
among the computed clusterings over-partitions the structure: it breaks β-sheets and α-helices, which should belong to the 
same cluster. The 4-way and 18-way partitions have relatively long windows of stability with a good balance between over- 
and under-partitioning. (B) Evaluation of the validity of the partitions obtained through a comparison of two experimental 
conformations of AK (open and closed). Each partition is obtained exclusively from the graph of the open configuration. The 
partitions are then evaluated against the experimental conformational distortions to calculate the error obtained by assuming 
rigidity of the predicted communities. Two plateaux are observed in the error: from 4 to 10 clusters and from 18 to 31 
clusters. This indicates that the 4-way and 18-way partitions (which show persistence over long time windows in (A)) represent 
a parsimonious compromise between rigidity prediction and a small number of clusters. (C) Some of the partitions in the 
hierarchy of the system are represented. Note the structural communities (represented by adjacent regions of the same color) 
appearing at different Markov time scales. 

Another important feature of the stability is that it 
gives a unified interpretation in terms of time scales of 
community detection methodologies that have been hitherto 
considered separately. We have shown that modularity, 
cut and normalized cut can be understood in relation 
to the stability at t = 1, while spectral clustering based 
on the normalized Fiedler vector is linked to stability at 
t = ∞. In addition, stability is connected to the concept 
of 'anti-clustering' and k-colourings [32, 33] based on the 
existence of recurrence patterns in the time-dependence 
of the trace of R_t. Although our stability measure (3) is 
defined in the discrete time setting, there is an equiva- 
lent continuous-time version of stability (also introduced 
above). This continuous stability can be linked to previ- 
ous numerical results where dynamic outcomes, such as 
synchronization, have been used as heuristics for graph 
partitioning [TH]. The continuous stability can also be 
exploited to analyze the regime beyond the resolution 
limit of modularity to obtain partitions finer than those 
obtained by modularity. In fact, one can show that previously 
proposed ad hoc multi-resolution measures [10] 
can be interpreted in terms of a linearization of the con- 
tinuous stability at small times. 
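
The two ends of this unified picture can be checked numerically on a small example. The sketch below assumes the one-step clustered autocovariance R_1 = H^T (Pi M - pi^T pi) H for the random walk M = D^{-1} A, compares its trace with the Newman-Girvan modularity computed directly, and then splits the graph by the sign of the normalized Fiedler vector, the bipartition associated with the long-time regime. The toy graph and all variable names are hypothetical and not taken from the paper.

```python
import numpy as np

# Toy graph: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=1)            # degrees
two_m = d.sum()              # twice the number of edges
pi = d / two_m               # stationary distribution of the random walk
M = A / d[:, None]           # transition matrix D^{-1} A

labels = np.array([0, 0, 0, 1, 1, 1])
H = (labels[:, None] == np.unique(labels)[None, :]).astype(float)

# Trace of the one-step clustered autocovariance (assumed form, t = 1).
R1 = H.T @ (np.diag(pi) @ M - np.outer(pi, pi)) @ H
stability_t1 = np.trace(R1)

# Modularity computed directly from its usual definition.
same = labels[:, None] == labels[None, :]
modularity = ((A - np.outer(d, d) / two_m) * same).sum() / two_m

# Normalized Fiedler vector: second eigenvector of the symmetric normalized
# Laplacian, mapped back through D^{-1/2}.
L_sym = np.eye(len(A)) - A / np.sqrt(np.outer(d, d))
eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
fiedler = eigvecs[:, 1] / np.sqrt(d)
fiedler_labels = (fiedler > 0).astype(int)

print("trace R_1 :", stability_t1)
print("modularity:", modularity)
print("Fiedler bipartition:", fiedler_labels)
```

On this graph the two numbers agree, and the sign pattern of the Fiedler vector separates the two triangles. The overall sign of an eigenvector is arbitrary, so only the grouping is meaningful, not which side is labeled 0 or 1.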

Complex systems, from protein dynamics to metabolic 
and social interactions to the internet, are often described 
as networks. The methodology presented here, which ex- 
tends seamlessly to undirected weighted graphs, uses the 
intimate connection between structure and dynamics to 
identify communities that can be revealing of the network 
structure. In some cases, the original networks are static 
and our dynamical approach is a convenient construct to 
reveal the intrinsic resolution scales of the problem. If the 
network has a dynamic origin, or indeed it can be related 
to a Markov process [25, 13], the analysis of the stability 
of the resulting graph provides information about the 
hierarchy of time scales of the underlying landscape of 
the system. From this dynamic viewpoint, the presence 
of communities relevant over particular time scales hints 
at a first step towards reduced representations in which 
the communities can be lumped into aggregate variables. 
The extension of this methodology to test systematically 
for reduced models or model reduction schemes will be 
the object of further research. 